2018-03-08

Course overview - 1Q

  • 1st week: Introduction to R and RStudio
  • 2nd week: Data structure
  • 3rd week: Graph I
  • 4th week: Data manipulation I
  • 5th week: Data manipulation II
  • 6th week: Graph II
  • 7th week: RMarkdown
  • 8th week: Midterm exam

Course overview - 2Q

  • 9th week: Basic statistics I
  • 10th week: Basic statistics II
  • 11th week: Graph III
  • 12th week: Artificial Intelligence
  • 13th week: Presentation of Project
  • 14th week: Git
  • 15th week: GitHub
  • 16th week: Final exam

Prologue

Why R?

  • A language and environment for statistical computing and graphics
  • Lets users program their own algorithms and use libraries written by others
  • Cool Graphics!
  • Easy access to up-to-date statistical methods
  • Reproducible research
  • Life will be easier

The 2017 Top Programming Languages - IEEE Spectrum

RStudio

  • GUI for R

  • Free!

  • Easy to use

Installation

R

Drawing

RStudio

Drawing

RStudio

  • Make a new file: File menu
  • Comment/Uncomment: Code menu
  • Set working directory: Session menu
  • Packages: Tools menu

  • Make an RMarkdown file: covered later
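
The Session and Tools menu actions above also have console equivalents. The following is a minimal sketch, assuming a temporary folder stands in for a real project directory; the package name in the commented line is only an example.

```r
# Remember the current working directory so it can be restored afterwards.
old_dir <- getwd()

# Session > Set Working Directory corresponds to setwd().
# tempdir() is only a stand-in here for a real project folder.
setwd(tempdir())
new_dir <- getwd()
print(new_dir)

# Tools > Install Packages corresponds to install.packages().
# Left commented out because it needs an internet connection.
# install.packages("ggplot2")

# Restore the original working directory.
setwd(old_dir)
```

Typing these commands in a script instead of clicking through the menus keeps a record of what was done, which helps when the analysis has to be repeated.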

You can also use Python or bash

for i in [1, 2, 3, 4, 5]:
  print(i)
## 1
## 2
## 3
## 4
## 5

pwd
python --version
## /Users/kwangyeolpark/Dropbox/WorkingWithMyself/강의/SW중심대학/2018_1Q2Q_lecture
## Python 2.7.10

Simple calculation in R

1 + 3
## [1] 4
a <- c(100, 234, 356, 477, 888)
mean(a)
## [1] 411
sd(a)
## [1] 301.2308
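
To see what mean() and sd() compute, the same values can be reproduced by hand. This is a small sketch, assuming the usual sample standard deviation (dividing by n - 1), which is what sd() uses; the names manual_mean and manual_sd are introduced here only for illustration.

```r
a <- c(100, 234, 356, 477, 888)
n <- length(a)

# Mean: sum of the values divided by how many there are.
manual_mean <- sum(a) / n

# Sample standard deviation: square root of the summed squared
# deviations from the mean, divided by (n - 1).
manual_sd <- sqrt(sum((a - manual_mean)^2) / (n - 1))

print(manual_mean)
print(manual_sd)

# Compare the hand calculation with the built-in function.
print(all.equal(manual_sd, sd(a)))
```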

Simple plot in R

library(ggplot2)  # provides qplot() and ggplot()
qplot(wt, mpg, data = mtcars)

Another plot in R

ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point(aes(color = factor(gear))) +
  facet_wrap(~ cyl)

Demo

demo(graphics)

Combo chart in R

library(googleVis)  # provides gvisComboChart() and the CityPopularity data
CityPopularity$Mean <- mean(CityPopularity$Popularity)
CC <- gvisComboChart(CityPopularity, xvar='City',
                     yvar=c('Mean', 'Popularity'),
                     options=list(seriesType='bars',
                                  width=450, height=300,
                                  title='City Popularity',
                                  series='{0: {type:"line"}}'))
plot(CC)

R Markdown

Drawing

Reproducible Research

Drawing

Doing Research

  • Collecting and cleaning data
    • MS-EXCEL or MS-ACCESS
    • SPSS
    • txt file (CSV, comma-separated values)
  • Analysis
    • SPSS or SAS
    • R
  • Writing
    • MS-WORD
    • LaTeX
    • HTML

Problems

  • Modification of data
    • Addition of new data
    • Error correction in dataset after analysis
  • The connection between the dataset and the tables/graphs can easily break.

  • Writing the methods section for an analysis you did 3 months ago.

  • Repetitive analyses are boring!

Case

Hi Dr. Park,

I have started working on the GPBB manuscript (ASH as well).

I need a paragraph from you describing the statistical methodology you used when you analyzed the data for the ISC abstract a few years ago.

Can you also send me the list of the final study cohort (300 patients)?

Thanks,

Reproducible research matters

  • Cleaning data + Analysis + Writing

  • Combining tools
    • R (and RStudio) + LaTeX or Markdown
  • Output
    • PDF, HTML, MS-WORD
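
The idea of keeping data, analysis, and writing connected can be sketched in base R alone. In the example below, the cohort data frame and the summarise_age() helper are hypothetical and the values are made up for illustration. The point is that the sentence in the report is generated from the data, so it updates when the data changes.

```r
# Hypothetical cohort data; the values are invented for illustration only.
cohort <- data.frame(id = 1:6, age = c(61, 58, 72, 66, 49, 70))

# Hypothetical helper: builds the sentence for the report from the data,
# so the numbers in the text never drift away from the dataset.
summarise_age <- function(d) {
  sprintf("The cohort included %d patients (mean age %.1f years, SD %.1f).",
          nrow(d), mean(d$age), sd(d$age))
}

report_line <- summarise_age(cohort)
cat(report_line, "\n")

# A new patient is added after the first analysis: rerun the same code
# instead of editing the numbers in the manuscript by hand.
cohort2 <- rbind(cohort, data.frame(id = 7, age = 55))
report_line2 <- summarise_age(cohort2)
cat(report_line2, "\n")
```

R Markdown applies the same principle to a whole document: the text and the code that produces its numbers, tables, and graphs live in one file.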